Best, worst and average case

In computer science, best, worst, and average cases of a given algorithm express what the resource usage is at least, at most and on average, respectively. Usually the resource being considered is running time, i.e. time complexity, but could also be memory or some other resource. Best case is the function which performs the minimum number of steps on input data of n elements. Worst case is the function which performs the maximum number of steps on input data of size n. Average case is the function which performs an average number of steps on input data of n elements.

In real-time computing, the worst-case execution time is often of particular concern since it is important to know how much time might be needed in the worst case to guarantee that the algorithm will always finish on time.

Average performance and worst-case performance are the most used in algorithm analysis. Less widely found is best-case performance, but it does have uses: for example, where the best cases of individual tasks are known, they can be used to improve the accuracy of an overall worst-case analysis. Computer scientists use probabilistic analysis techniques, especially expected value, to determine expected running times.

The terms are used in other contexts; for example the worst- and best-case outcome of an epidemic, worst-case temperature to which an electronic circuit element is exposed, etc. Where components of specified tolerance are used, devices must be designed to work properly with the worst-case combination of tolerances and external conditions.

Best-case performance for algorithm

The term best-case performance is used in computer science to describe an algorithm's behavior under optimal conditions. For example, the best case for a simple linear search on a list occurs when the desired element is the first element of the list.

Development and choice of algorithms is rarely based on best-case performance: most academic and commercial enterprises are more interested in improving average-case complexity and worst-case performance. Algorithms may also be trivially modified to have good best-case running time by hard-coding solutions to a finite set of inputs, making the measure almost meaningless.^[1]

Worst-case versus amortized versus average-case performance

Worst-case performance analysis and average-case performance analysis have some similarities, but in practice usually require different tools and approaches.

Determining what typical input means is difficult, and often that average input has properties which make it difficult to characterise mathematically (consider, for instance, algorithms that are designed to operate on strings of text). Similarly, even when a sensible description of a particular "average case" (which will probably only be applicable for some uses of the algorithm) is possible, they tend to result in more difficult analysis of equations.^[2]

Worst-case analysis gives a safe analysis (the worst case is never underestimated), but one which can be overly pessimistic, since there may be no (realistic) input that would take this many steps.

In some situations it may be necessary to use a pessimistic analysis in order to guarantee safety. Often however, a pessimistic analysis may be too pessimistic, so an analysis that gets closer to the real value but may be optimistic (perhaps with some known low probability of failure) can be a much more practical approach. One modern approach in academic theory to bridge the gap between worst-case and average-case analysis is called smoothed analysis.

When analyzing algorithms which often take a small time to complete, but periodically require a much larger time, amortized analysis can be used to determine the worst-case running time over a (possibly infinite) series of operations. This amortized cost can be much closer to the average cost, while still providing a guaranteed upper limit on the running time. So e.g. online algorithms are frequently based on amortized analysis.

The worst-case analysis is related to the worst-case complexity.^[3]

Practical consequences

Many algorithms with bad worst-case performance have good average-case performance. For problems we want to solve, this is a good thing: we can hope that the particular instances we care about are average. For cryptography, this is very bad: we want typical instances of a cryptographic problem to be hard. Here methods like random self-reducibility can be used for some specific problems to show that the worst case is no harder than the average case, or, equivalently, that the average case is no easier than the worst case.

On the other hand, some data structures like hash tables have very poor worst-case behaviors, but a well written hash table of sufficient size will statistically never give the worst case; the average number of operations performed follows an exponential decay curve, and so the run time of an operation is statistically bounded.

Examples

Sorting algorithms

Algorithm	Data structure	Time complexity:Best	Time complexity:Average	Time complexity:Worst	Space complexity:Worst
Quick sort	Array	O(n log(n))	O(n log(n))	O(n²)	O(n)
Merge sort	Array	O(n log(n))	O(n log(n))	O(n log(n))	O(n)
Heap sort	Array	O(n log(n))	O(n log(n))	O(n log(n))	O(1)
Smooth sort	Array	O(n)	O(n log(n))	O(n log(n))	O(1)
Bubble sort	Array	O(n)	O(n²)	O(n²)	O(1)
Insertion sort	Array	O(n)	O(n²)	O(n²)	O(1)
Selection sort	Array	O(n²)	O(n²)	O(n²)	O(1)
Bogo sort	Array	O(n)	O(n n!)	O(∞)	O(1)

Insertion sort applied to a list of n elements, assumed to be all different and initially in random order. On average, half the elements in a list A₁ ... A_j are less than element A_j+1, and half are greater. Therefore, the algorithm compares the (j + 1)^th element to be inserted on the average with half the already sorted sub-list, so t_j = j/2. Working out the resulting average-case running time yields a quadratic function of the input size, just like the worst-case running time.
Quicksort applied to a list of n elements, again assumed to be all different and initially in random order. This popular sorting algorithm has an average-case performance of O(n log(n)), which contributes to making it a very fast algorithm in practice. But given a worst-case input, its performance degrades to O(n²). Also, when implemented with the "shortest first" policy, the worst-case space complexity is instead bounded by O(log(n)).
Heapsort has O(n) time when all elements are the same. Heapify takes O(n) time and then removing elements from the heap is O(1) time for each of the n elements. The run time grows to O(nlog(n)) if all elements must be distinct.
Bogosort has O(n) time when the elements are sorted on the first iteration. In each iteration all elements are checked if in order. There are n! possible permutations; with a balanced random number generator, almost each permutation of the array is yielded in n! iterations. Computers have limited memory, so the generated numbers cycle; it might not be possible to reach each permutation. In the worst case this leads to O(∞) time, an infinite loop.

Data structures

Data structure	Time complexity								Space complexity
Data structure	Avg: Indexing	Avg: Search	Avg: Insertion	Avg: Deletion	Worst: Indexing	Worst: Search	Worst: Insertion	Worst: Deletion	Worst
Basic array	O(1)	O(n)	O(n)	O(n)	O(1)	O(n)	O(n)	O(n)	O(n)
Dynamic array	O(1)	O(n)	O(n)	—	O(1)	O(n)	O(n)	—	O(n)
Stack	O(n)	O(n)	O(1)	O(1)	O(n)	O(n)	O(1)	O(1)	O(n)
Queue	O(n)	O(n)	O(1)	O(1)	O(n)	O(n)	O(1)	O(1)	O(n)
Singly linked list	O(n)	O(n)	O(1)	O(1)	O(n)	O(n)	O(1)	O(1)	O(n)
Doubly linked list	O(n)	O(n)	O(1)	O(1)	O(n)	O(n)	O(1)	O(1)	O(n)
Skip list	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)	O(n)	O(n)	O(n)	O(nlog (n))
Hash table	—	O(1)	O(1)	O(1)	—	O(n)	O(n)	O(n)	O(n)
Binary search tree	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)	O(n)	O(n)	O(n)	O(n)
Cartesian tree	—	O(log (n))	O(log (n))	O(log (n))	—	O(n)	O(n)	O(n)	O(n)
B-tree	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)
Red–black tree	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)
Splay tree	—	O(log (n))	O(log (n))	O(log (n))	—	O(log (n))	O(log (n))	O(log (n))	O(n)
AVL tree	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)
K-d tree	O(log (n))	O(log (n))	O(log (n))	O(log (n))	O(n)	O(n)	O(n)	O(n)	O(n)

Linear search on a list of n elements. In the absolute worst case, the search must visit every element once. This happens when the value being searched for is either the last element in the list, or is not in the list. However, on average, assuming the value searched for is in the list and each list element is equally likely to be the value searched for, the search visits only n/2 elements.

References

^ Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein) 2001, Chapter 2 "Getting Started".In Best-case complexity, it gives the lower bound on the running time of the algorithm of any instances of input.
^ Spielman, Daniel; Teng, Shang-Hua (2009), "Smoothed analysis: an attempt to explain the behavior of algorithms in practice" (PDF), Communications of the ACM, 52 (10), ACM: 76-84, doi:10.1145/1562764.1562785, S2CID 7904807
^ "Worst-case complexity" (PDF). Archived (PDF) from the original on 2011-07-21. Retrieved 2008-11-30.

[1] Introduction to Algorithms (Cormen, Leiserson, Rivest, and Stein) 2001, Chapter 2 "Getting Started".In Best-case complexity, it gives the lower bound on the running time of the algorithm of any instances of input.

[2] Spielman, Daniel; Teng, Shang-Hua (2009), "Smoothed analysis: an attempt to explain the behavior of algorithms in practice" (PDF), Communications of the ACM, 52 (10), ACM: 76-84, doi:10.1145/1562764.1562785, S2CID 7904807

[3] "Worst-case complexity" (PDF). Archived (PDF) from the original on 2011-07-21. Retrieved 2008-11-30.

[1]

[2]

[3]